Migrate ngmix PSF/shape column names (tracking shapepipe#761) #201

Open
cailmdaley wants to merge 11 commits into develop from migrate/ngmix-psf-column-names

@cailmdaley cailmdaley commented Jun 20, 2026

Collaborator

Closes #218

The sp_validation (consumer) half of the ShapePipe column-grammar unification (CosmoStat/shapepipe#761, landing in #741 / ngmix v2.0). ShapePipe now writes every shape and PSF column under one grammar — ESTIMATOR_COMPONENT[_ERR][_OBJECT]_SHEAR, uppercase, galaxy implicit (no _GAL token) — and stores a single size T = 2σ². This PR moves every read in sp_validation onto that grammar and drops the machinery the new grammar makes obsolete.

The authoritative old→new map is docs/ngmix_psf_column_migration.md (rewritten here). In brief:

| Old | New |
| --- | --- |
| `NGMIX_ELL_{s}` / `_ELL_ERR_{s}` (2-vec) | `NGMIX_G1/G2_{s}` / `_G1/G2_ERR_{s}` (named scalars) |
| `NGMIX_Tpsf_{s}` | `NGMIX_T_PSF_RECONV_{s}` |
| `NGMIX_ELL_PSFo_{s}_0/1`, `NGMIX_T_PSFo_{s}` | `NGMIX_G1/G2_PSF_ORIG_{s}`, `NGMIX_T_PSF_ORIG_{s}` |
| `E1/2_{PSF,STAR}_HSM` | `HSM_G1/G2_{PSF,STAR}` |
| `SIGMA_*_HSM` | `HSM_T_*` (now the area T = 2σ²) |
| `FLAG_*_HSM` | `HSM_FLAG_*` |
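
To make the grammar concrete, here is a minimal sketch that assembles names as ESTIMATOR_COMPONENT[_ERR][_OBJECT]_SHEAR. The helper `column_name` and its arguments are hypothetical and exist in neither sp_validation nor ShapePipe. The shear tokens used (`NOSHEAR`, `1P`) are the ones that appear in this PR, and treating the shear token as optional for HSM columns is an assumption read off the table above.

```python
def column_name(estimator, component, shear=None, obj=None, err=False):
    """Assemble a v2-style column name (hypothetical helper, illustration only).

    Token order follows the grammar quoted in the PR description:
    ESTIMATOR_COMPONENT[_ERR][_OBJECT]_SHEAR.
    """
    parts = [estimator, component]
    if err:
        parts.append("ERR")
    if obj is not None:
        # The galaxy is the implicit object: leave obj=None, never pass "GAL".
        parts.append(obj)
    if shear is not None:
        # Assumption: HSM columns carry no shear token, so shear stays None.
        parts.append(shear)
    return "_".join(parts).upper()


galaxy_g1 = column_name("NGMIX", "G1", shear="NOSHEAR")
galaxy_g2_err = column_name("NGMIX", "G2", shear="1P", err=True)
psf_size = column_name("NGMIX", "T", shear="NOSHEAR", obj="PSF_RECONV")
hsm_psf_g1 = column_name("HSM", "G1", obj="PSF")
```

Under these assumptions the four calls reproduce names from the "New" column: `NGMIX_G1_NOSHEAR`, `NGMIX_G2_ERR_1P`, `NGMIX_T_PSF_RECONV_NOSHEAR` and `HSM_G1_PSF`.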

Beyond the renames

  • Sizes are T = 2σ². The square_size machinery is retired (config flags, not_square_size, both param builders default to False) and the hand-rolled SIGMA_*_HSM ** 2 squaring is gone from the paper scripts — which also fixes a latent factor error where SP_v1.3 squared an already-T value. Any σ / FWHM / r50 now comes from cs_util.size, the single source of truth for size math.
  • spread_model removed. ShapePipe stopped writing SPREAD_*; the do_spread_model branch and match_spread_class are gone, and star/galaxy classification uses the size-based path.
  • galsim consumer path removed. The GALSIM_* reads were unreachable — shape is hardcoded to ngmix and nothing can produce galsim columns — so the branch is deleted and unknown estimator prefixes now fail loudly. ngmix is the sole estimator.
  • NGMIX_MOM_FAIL → NGMIX_MCAL_TYPES_FAIL. The producer reused the slot; sp_validation cuts on == 0 either way, so the read migrates cleanly.
  • API cleanup. The obsolete col_2d flag is dropped from metacal / get_variance_ivweights; match_subsample takes two scalar keys; check_invalid drops comp.
  • No e→g conversion. Both estimators store native g, so the ellipticity-type step flagged in #218 (Column-grammar migration: adopt ShapePipe-v2 column names (ngmix_v2.0)) isn't needed.
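
The size bullet above rests on the relation T = 2σ². Here is a minimal sketch of that arithmetic, assuming a circular Gaussian profile for the FWHM conversion. The function names are hypothetical stand-ins and are not the cs_util.size API, whose signatures are not shown in this PR.

```python
import math


def sigma_from_T(T):
    """Invert T = 2 * sigma**2 (hypothetical stand-in, not cs_util.size)."""
    return math.sqrt(T / 2.0)


def fwhm_from_T(T):
    """FWHM of a circular Gaussian: 2 * sqrt(2 ln 2) * sigma."""
    return 2.0 * math.sqrt(2.0 * math.log(2.0)) * sigma_from_T(T)


sigma = 0.3
T = 2.0 * sigma**2          # what a v2 HSM_T_* column would hold for this sigma

recovered_sigma = sigma_from_T(T)
fwhm = fwhm_from_T(T)

# The kind of latent error described above: squaring a column that already
# holds T gives a quantity proportional to sigma**4, not a size.
double_squared = T**2
```

Since v2 columns already hold T, a consumer that still applies `** 2` would silently produce `double_squared` rather than a size, which is why the squaring is removed rather than kept behind a flag.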

One value change to watch

NGMIX_*_PSF_ORIG now carries a true original-PSF fit (CosmoStat/shapepipe#749), not the reconvolved-kernel alias the old ELL_PSFo / T_PSFo columns silently held. At the code level this is a straight rename; the look-at-the-numbers check on the α-leakage / size-ratio cuts happens once a v2 catalogue exists — analysis, not code, and it doesn't gate this PR.

Merge coordination

Merging makes develop require v2 columns and stop reading today's catalogues, so this should land with (or just after) CosmoStat/shapepipe#761 → #741 and the first regenerated catalogue.

shear_psf_leakage #27 (separate repo, Sacha) removes square_size from the ρ/τ internals; this PR sets square_size=False and passes the T-columns, correct independently of that migration.

The in-container test suite is green (137 passed; the lone failure is a pre-existing matplotlib TeX-font gap, unrelated to columns).

— Claude on behalf of Cail

Downstream of CosmoStat/shapepipe#761, which makes the ngmix output columns a
single source of truth (true original-PSF fit stored separately, ELL split into
scalar G1/G2, ESTIMATOR_COMPONENT_OBJECT grammar). Captures the old->new column
map and the sp_validation consumer checklist. Blocked on #761 landing.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
cailmdaley and others added 4 commits July 1, 2026 14:27
The plan committed 2026-06-20 predated the "drop GAL" decision on
shapepipe#761: it showed every galaxy column gaining a _GAL token and
said nothing about HSM. Rewrite it to the authoritative as-shipped
grammar, grounded in the #761 producer source:

- galaxy is the implicit object (no _GAL token); ellipticity splits into
  named scalars NGMIX_G1/G2_{shear}, errors likewise;
- NGMIX_Tpsf -> NGMIX_T_PSF_RECONV (value-safe); ELL_PSFo/T_PSFo ->
  G1/G2/T_PSF_ORIG (value change, shapepipe#749);
- HSM E*_{PSF,STAR} -> HSM_G*_{PSF,STAR} (native g, pure rename),
  SIGMA_*_HSM -> HSM_T_* (units: now T=2sigma^2), FLAG_*_HSM -> HSM_FLAG_*;
- spread_model removed; sizes route through cs_util.size; square_size
  retired. e->g conversion is not needed anywhere (both estimators store g).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8BwwtExg961NknUeu5GCv
Consumer half of shapepipe#761. Every shape-column read in the package
moves onto the one grammar:

- metacal (calibration.py): galaxy ellipticity read as named scalars
  NGMIX_G1/G2_{shear} and errors NGMIX_G1/G2_ERR_{shear} (was packed
  NGMIX_ELL_*[:, c]); NGMIX_Tpsf -> NGMIX_T_PSF_RECONV. The obsolete
  col_2d flag (packed-vs-split ellipticity) is removed from the metacal
  ctor and get_variance_ivweights — v2 is always named scalars.
- galaxy.py: NGMIX_ELL_PSFo[:, 0] -> NGMIX_G1_PSF_ORIG scalar; the
  spread_model branch/param dropped (shapepipe stopped writing it) —
  classification falls back to the size-based path.
- catalog.py: match_spread_class deleted (SPREAD_CLASS gone);
  match_subsample takes two scalar keys (g1_key, g2_key) instead of a
  packed ell_key; check_invalid drops the now-meaningless comp arg.
- glass_mock.py: HSM E1/E2_PSF_HSM -> HSM_G1/G2_PSF.
- rho_tau.py: sizes are always T=2sigma^2 now, so square_size is False
  and the not_square_size list is gone.
- tests: synthetic catalogues + configs rebuilt on the v2 grammar in
  lock-step (the migration's internal-consistency check).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8BwwtExg961NknUeu5GCv
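
For illustration, the packed-to-scalar change described in this commit can be sketched on a toy NumPy structured array. This is not code from the PR: `split_packed_ellipticity` is a hypothetical helper, and the toy catalogue only carries the `NOSHEAR` columns. The PSF_ORIG columns are deliberately left out because the PR description notes that their values change and cannot be derived from the old ones.

```python
import numpy as np


def split_packed_ellipticity(cat, shear_types=("NOSHEAR",)):
    """Map v1-style packed columns to v2-style named scalars (hypothetical)."""
    out = {}
    for s in shear_types:
        ell = cat[f"NGMIX_ELL_{s}"]          # v1: shape (n, 2)
        err = cat[f"NGMIX_ELL_ERR_{s}"]      # v1: shape (n, 2)
        out[f"NGMIX_G1_{s}"] = ell[:, 0]     # v2: one scalar column each
        out[f"NGMIX_G2_{s}"] = ell[:, 1]
        out[f"NGMIX_G1_ERR_{s}"] = err[:, 0]
        out[f"NGMIX_G2_ERR_{s}"] = err[:, 1]
        # Value-safe rename of the reconvolved PSF size.
        out[f"NGMIX_T_PSF_RECONV_{s}"] = cat[f"NGMIX_Tpsf_{s}"]
    return out


# Toy v1-style catalogue with three objects.
dtype = [
    ("NGMIX_ELL_NOSHEAR", "f8", (2,)),
    ("NGMIX_ELL_ERR_NOSHEAR", "f8", (2,)),
    ("NGMIX_Tpsf_NOSHEAR", "f8"),
]
v1 = np.zeros(3, dtype=dtype)
v1["NGMIX_ELL_NOSHEAR"] = [[0.01, -0.02], [0.03, 0.04], [0.0, 0.05]]
v1["NGMIX_ELL_ERR_NOSHEAR"] = 0.1
v1["NGMIX_Tpsf_NOSHEAR"] = 0.2

v2 = split_packed_ellipticity(v1)
g1 = v2["NGMIX_G1_NOSHEAR"]  # replaces v1["NGMIX_ELL_NOSHEAR"][:, 0]
```

A helper like this may be handy for building synthetic test catalogues on the v2 grammar. The consumer code itself only needs the direct read shown on the last line.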
…_size

- cat_config.yaml: every HSM block onto the v2 grammar — E*_{PSF,STAR}_HSM
  -> HSM_G*_{PSF,STAR}, SIGMA_*_HSM / T_*_HSM -> HSM_T_*, FLAG_*_HSM ->
  HSM_FLAG_*. The DES/piff block is untouched (Piff isn't this grammar).
  The dead per-dataset square_size flags are dropped: sizes are always
  T=2sigma^2 now, so nothing squares (this also fixes a latent factor
  error in the SP_v1.3 block, which squared its already-T T_PSF_HSM).
- mask_v1.X.{2..11}.yaml: NGMIX_ELL_PSFo_NOSHEAR_0/_1 ->
  NGMIX_G1/G2_PSF_ORIG_NOSHEAR.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8BwwtExg961NknUeu5GCv
…mmar

Non-library consumers onto the same grammar:

- scripts/calibration/{extract_info,params,calibrate_comprehensive_cat}.py:
  Tpsf -> T_PSF_RECONV; ELL_PSFo -> G1/G2_PSF_ORIG; T_PSFo -> T_PSF_ORIG;
  bare galaxy ELL -> G1/G2 scalars (match_subsample/check_invalid callers
  updated for the new signatures); HSM E*_PSF -> HSM_G*_PSF; spread_model
  wiring removed; col_2d args dropped.
- scripts/apply_alpha_snr_size_bin.py, scripts/examples/demo_*.py: same
  Tpsf/PSFo renames, col_2d dropped.
- cosmo_val/compute_theory_cov.py, cosmo_inference/scripts/masking.py:
  Tpsf -> T_PSF_RECONV; square_size hardcoded False.
- papers/catalog/*, papers/harmonic/*: HSM ellipticity renames and the
  sigma->T units fix (hand-rolled SIGMA_*_HSM**2 -> read HSM_T_* directly).

Out of scope, left as-is: GALSIM_* columns, scratch/, and the
cosmo_inference notebooks (own migration fiber).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01M8BwwtExg961NknUeu5GCv
@cailmdaley cailmdaley force-pushed the migrate/ngmix-psf-column-names branch from 01f9c8f to c84e404 Compare July 1, 2026 12:29
cailmdaley and others added 5 commits July 1, 2026 15:00
Two tightening-pass changes to complete ShapePipe-v2 column adoption in
the package:

- Remove the galsim estimator path as dead code. `shape` is hardcoded to
  "ngmix", extract_info.py raises for any other value, and nothing outside
  scratch/ instantiates metacal(prefix="GALSIM") or calls
  classification_galaxy_galsim. Migrating it would carry an untestable path
  whose shared `col_1p = {prefix}_T_PSF_RECONV_1P` read never matched the
  galsim producer output (GALSIM_T_PSF_*, not ..._T_PSF_RECONV_*) — already
  broken. Removed: metacal._read_data_galsim, the prefix=="GALSIM" dispatch
  (now else: raise — unknown prefixes fail loudly), the two galsim
  ellipticity sign flips in _shear_response/_selection_response,
  galaxy.classification_galaxy_galsim, the sh=="galsim" branch in
  catalog.get_snr (now else: raise), and the unused shape_method arg on
  get_calibrated_quantities/get_calibrated_m_c. ngmix is the sole estimator;
  the `prefix` param stays (names the column family; a future NGMIXm moments
  family could reuse it).

- Rename the galaxy failure-flag read NGMIX_MOM_FAIL -> NGMIX_MCAL_TYPES_FAIL
  in classification_galaxy_ngmix. The v1 column counted moments-initial-guess
  failures (get_guess, gone in v2); the producer reused the slot for a
  failed-metacal-types count. sp_validation cuts on == 0 either way (keep
  fully-measured objects), so the read migrates cleanly; the underlying
  failure mode changed, so this is the first line to check if the post-cut
  galaxy count looks off against a regenerated v2 catalogue.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019f8w2twg4b3Ga36dSxNbLW
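
A minimal sketch of the two behaviours this commit describes: the `== 0` cut on the renamed failure flag, and the loud failure for any prefix other than ngmix. The function `fully_measured_mask` and the toy catalogue are hypothetical, and the choice of `ValueError` is an assumption. The commit only states that unknown prefixes now raise.

```python
import numpy as np


def fully_measured_mask(cat, prefix="NGMIX"):
    """Keep objects whose failed-metacal-types count is zero (hypothetical)."""
    if prefix != "NGMIX":
        # ngmix is the sole estimator, so anything else fails loudly.
        raise ValueError(f"unknown estimator prefix: {prefix!r}")
    return cat[f"{prefix}_MCAL_TYPES_FAIL"] == 0


# Toy catalogue: two clean objects, two with failed metacal types.
cat = {"NGMIX_MCAL_TYPES_FAIL": np.array([0, 2, 0, 1])}
mask = fully_measured_mask(cat)
n_kept = int(mask.sum())

try:
    fully_measured_mask(cat, prefix="GALSIM")
    galsim_rejected = False
except ValueError:
    galsim_rejected = True
```

Comparing `n_kept` between a v1 and a regenerated v2 catalogue is one simple way to carry out the post-cut galaxy-count check suggested in the commit message.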
…figs

The v1 failure-flag column no longer exists in the ShapePipe-v2 header; the
producer renamed the slot to NGMIX_MCAL_TYPES_FAIL. Propagate the rename into
all ten calibration mask configs so the flag column resolves against a v2
catalogue.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019f8w2twg4b3Ga36dSxNbLW
- NGMIX_MOM_FAIL -> NGMIX_MCAL_TYPES_FAIL in params.add_cols_pre_cal (and its
  int-format set), the two demo column lists, and hist_mag's read lists.
- Drop the galsim mentions left in params.py (shape comment) and
  extract_info.py (the stale "cuts common to ngmix and galsim" comment, whose
  spread-model line also went with the earlier spread_model removal).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019f8w2twg4b3Ga36dSxNbLW
cfis_analysis.ipynb read E1/E2_PSF_HSM into treecorr for the C_sys PSF test;
rename to HSM_G1/G2_PSF (native g, pure rename). A sweep of all tracked
notebooks found no other old column tokens.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019f8w2twg4b3Ga36dSxNbLW
Update the authoritative migration map for the tightening pass:
- *_PSF_ORIG reframed as a value change that is NOT a code blocker (the rename
  is correct as-is; a v2 catalogue only enables a look-at-numbers check). The
  real merge gate is cutover timing — merging makes develop require v2 columns.
- NGMIX_MOM_FAIL -> NGMIX_MCAL_TYPES_FAIL added to the ngmix map with a
  semantics-change note.
- galsim path documented as removed dead code (was "left untouched, flagged").
- notebooks added to the consumer-sites list.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_019f8w2twg4b3Ga36dSxNbLW

shear_psf_leakage #27 removed the square_size parameter from build_cat_to_compute_{rho,tau} and CovTauTh (HSM_T_* already stores the area T=2σ², so the flag was dead weight). sp_validation still threaded square_size into those leakage calls; once #27's container lands, passing it would raise TypeError. Drop the square_size key from both param builders (rho_tau.get_params_rho_tau, cosmo_val/compute_theory_cov.py) and stop passing it to the handlers. Behavior-identical against the current container (square_size defaults to False), required for the post-#27 one.

Also correct the migration doc: galsim shapes are not producible by ShapePipe (no shape-measurement runner; make_cat's galsim mode reads an external catalogue nothing generates; production configs pin ngmix) — the decisive reason the dead consumer path was removed rather than migrated; and reframe the shear_psf_leakage coordination now that #27 is a sibling PR landing the same grammar.

Suite: 137 passed, 1 skipped (unrelated cmss12.tfm font gap).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01GnCCTxKE13cHzPr8BUeKZE
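
The TypeError mentioned in this commit is ordinary Python behaviour when a keyword is passed to a callee that no longer declares it. A minimal sketch follows, with hypothetical stand-ins (`leakage_handler`, `build_params`) in place of the real shear_psf_leakage functions and sp_validation param builders. The `PSF_size` and `star_size` keys come from the diff context below, while the column names used as values are assumptions following the `HSM_T_*` pattern.

```python
def leakage_handler(*, PSF_size, star_size):
    """Hypothetical post-#27 callee: it has no square_size parameter."""
    return {"PSF_size": PSF_size, "star_size": star_size}


def build_params(cat_psf):
    """Hypothetical param builder: copies the size columns, no square_size key."""
    return {
        "PSF_size": cat_psf["PSF_size"],
        "star_size": cat_psf["star_size"],
    }


# Assumed column names, following the HSM_T_* pattern.
params = build_params({"PSF_size": "HSM_T_PSF", "star_size": "HSM_T_STAR"})
result = leakage_handler(**params)

# Still threading the retired flag through fails as soon as the callee drops it.
try:
    leakage_handler(square_size=False, **params)
    raised_type_error = False
except TypeError:
    raised_type_error = True
```

Dropping the key in the builder keeps the call valid against both versions of the callee, provided the older one defaults `square_size` to False as this commit message states.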
@cailmdaley cailmdaley marked this pull request as ready for review July 1, 2026 16:41
@cailmdaley cailmdaley requested a review from sachaguer July 1, 2026 16:41
@@ -18,7 +18,6 @@ def get_params_rho_tau(cat, survey="other"):
params["e2_star_col"] = cat["psf"]["e2_star_col"]
params["PSF_size"] = cat["psf"]["PSF_size"]
params["star_size"] = cat["psf"]["star_size"]

Contributor


Removing this field here is correct but needs an associated PR in shear_psf_leakage. The current default in the develop branch there squares the size. I can fix it easily.

@sachaguer sachaguer left a comment

Contributor


Looks good to me!
